librsb
A Shared Memory Parallel Sparse Matrix computations Library for the Recursive Sparse Blocks Format
librsb
is a library for sparse matrix computations implementing
the (RSB) matrix format
for multicore-parallel, shared-memory systems, via OpenMP.
It provides the most common operations for iterative solvers, e.g.:
-
sparse matrix-dense matrix multiplication
(aka SpMM: C ← β C + α A B, ...);
if dense matrix is a vector: (aka SpMV: y ← β y + α A x, ...)
-
triangular solution
(y ← T-1x, ...)
-
rows scaling
(A i,: ← α A i,:),
columns scaling
(A :,j ← α A :,j)
-
diagonal extraction
(di ← Ai,i, ∀ i=1..N),
blocks extraction
(B1:m,1:k ← Ai:i+m-1,j:j+k-1),
norm computation
( |A|1, |A|2, |A|∞),
-
format interoperability with
CSR,
CSC,
COO.
The RSB format is especially well suited for
symmetric matrices
(A ⩵ AT)
and transposed
(y ← β y + α AT B)
multiplication
variants.
librsb
also implements the Sparse BLAS standard, as specified in the [BLAS Technical Forum] documents.
You might be interested in:
The
RSB
data structure is hierarchical and allows cache-efficient and multi-threaded
(that is, shared memory-parallel) operations on large sparse matrices.
Its basic idea is a recursive subdivision of a matrix in the two dimensions.
The subdivision algorithm attempts to partition the matrix until the individual submatrices occupy approximately the amount of memory contained in the
CPU
caches.
The number of cores/thread available also plays a role.
The following picture shows the symmetric matrix
audikw_1 (943695 rows, 39297771 nonzeroes,
obtained from the
SuiteSparse Matrix Collection
and
partitioned for different numbers of executing threads (respectively 1,2,4 and 8).
Because of the symmetry, the upper triangle representation can be omitted.
The memory layout of the individual sparse submatrices follows a Z (or Morton) ordering.
Each submatrix is stored either in coordinate or CSR (Compressed Sparse Rows) format, and the numerical index type size is chosen in order to minimize memory consumption.
This layout enables efficient and multithreaded algorithms implementing the Sparse BLAS operations.
The algorithms and the performance of RSB in practice are described in a series of papers; give a look at the
papers
.
The following distributions or projects package librsb
, according to their own specific method:
Reference Documentation and README
The complete documentation in HTML format for the latest released 1.3 version is available here.
The same documentation is included in the
sources archive
and provides several examples for starting using librsb
.
For the old version 1.2, click here.
librsb
is efficient on large matrices;
that is, matrices whose memory occupation exceeds the memory cache.
On current machines, this boundary is around a few million of nonzeroes.
You might find librsb
slower
than a good CSR/CSC serial implementation if your matrices are smaller
than that; but also faster if larger than that.
Keep in mind that what contributes the RSB format in librsb
to be fast is:
symmetric matrices,
large matrices,
many right hand sides,
multiple cores.
You are welcome to
give feedback
about any performance related doubt, or provide a test case.
Building the librsb
library, C headers, documentation
Most numerical kernels code is auto-generated, and the supported numerical types can be chosen by the user at build time.
librsb
can also be built serially (without OpenMP parallelism), if required.
So there are many options to build a customized library (see documentation).
However, building with reasonable defaults can be as simple as:
./configure --help
./configure --enable-fortran-module-install --enable-matrix-types="double, double complex" \
CC=gcc CXX=g++ FC=gfortran CFLAGS=-O3 CXXFLAGS=-O3 FCFLAGS=-O3 --prefix=$HOME/local
make cleanall
make
make qtests
make tests
make install
Please note that if you intend to link librsb
to shared libraries (e.g.: in case of sparsersb
's sparsersb.oct
) then in the above you should specify flags to build position indipendent code, e.g.:
CFLAGS="-O3 -fPIC" FCFLAGS="-O3 -fPIC"
.
If you want to have a debug build with error verbosity activated, you may configure with something like:
./configure --enable-allocator-wrapper --enable-interface-error-verbosity="2" \
--enable-internals-error-verbosity="1" CFLAGS="-O0 -ggdb" CXXFLAGS="-O0 -ggdb" FCFLAGS="-O0 -ggdb"
Building the sparsersb
Package for GNU Octave
Assume you have both of librsb
and a recent GNU Octave (complete with development programs) installed.
Then you can build sparsersb
by:
more off; pkg -local -verbose install sparsersb-1.0.9.tar.gz
pkg load sparsersb
test sparsersb
help sparsersb
Alternatively, in the case you don't have a
librsb
installation at hand, you can have
sparsersb
use a
librsb
sources archive and build all in one go (linking statically):
setenv "LIBRSB_TARBALL" ~/librsb-1.3.0.0.tar.gz
more off; pkg -local -verbose install sparsersb-1.0.9.tar.gz
pkg load sparsersb
test sparsersb
help sparsersb
Download instructions for both packages are here.
If you think any information here is wrong or out of date, consider reporting; see the
contact section for this.
The sparsersb package
for
GNU Octave
allows you to manipulate sparse matrices using librsb
without the need of programming in C/C++ or Fortran.
You can reuse your GNU Octave programs employing sparse matrices, and instead of creating matrices using the "sparse
" keyword, use "sparsersb
", like in this example:
R=(rand(3)>.6)
A_octave=sparse(R)
A_librsb=sparsersb(R)
test sparsersb
help sparsersb
Matrices created in this way support transparently nearly all of the operations supported by native GNU Octave
sparse
(see the GNU Octave's Sparse Matrices documentation).
If you are using a multi-core CPU and your matrices are significantly large,
with
sparsersb
.
you shall get a much faster matrix-vector multiplication than with Octave's
sparse
.
See the build instructions section on how to build
sparsersb
and the download section on where to get it.
Let us consider a simple program creating a sparse matrix and multiplying it by a vector.
Create a source code program file (say, mysbc.c
) with a text editor:
#include <rsb.h>
#include <blas_sparse.h>
int main(const int argc, char * const argv[])
{
blas_sparse_matrix matrix = blas_invalid_handle;
const int nnzA=4;
const int nrA=3,ncA=3;
const int incX=1,incB=1;
int IA[]={ 0, 1, 2, 2 };
int JA[]={ 0, 1, 0, 2 };
double VA[]={ 11.0, 22.0, 13.0, 33.0 };
double X[]={ 0.0, 0.0, 0.0 };
double B[]={ -1.0, -2.0, -2.0 };
rsb_err_t errval = RSB_ERR_NO_ERROR;
if((errval = rsb_lib_init(RSB_NULL_INIT_OPTIONS)) != RSB_ERR_NO_ERROR) return -1;
matrix = BLAS_duscr_begin(nrA,ncA);
if(matrix==blas_invalid_handle) return -1;
if( BLAS_ussp(matrix,blas_lower_symmetric) != 0 ) return -1;
if( BLAS_duscr_insert_entries(matrix, nnzA, VA, IA, JA) != 0 ) return -1;
if( BLAS_duscr_end(matrix) == blas_invalid_handle ) return -1;
if( BLAS_dusmv(blas_no_trans,-1,matrix,B,incB,X,incX)) return -1;
if( BLAS_usds(matrix)) return -1;
if((errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS)) != RSB_ERR_NO_ERROR) return -1;
return 0;
}
To compile and build, execute in your shell:
export PATH=$PATH:$HOME/local/librsb/bin/
gcc `librsb-config --I_opts --cflags` -c mysbc.c
gcc -o mysbc mysbc.o `librsb-config --static --ldflags --extra_libs`
./mysbc
See the reference documentation for more complete examples.
The C++
<rsb.hpp>
header introduced in librsb-1.3 allows C++ programs like
mysbp.c
:
#include <rsb.hpp>
#include <vector>
#include <array>
int main() {
rsb::RsbLib rsblib;
const int nnzA { 7 }, nrhs { 2 };
const int nrA { 6 }, ncA { 6 };
const std::vector<int> IA {0,1,2,3,4,5,1};
const int JA [] = {0,1,2,3,4,5,0};
const std::vector<double> VA {1,1,1,1,1,1,2}, B(nrhs * ncA,1);
std::array<double,nrhs * nrA> C;
const double alpha {2}, beta {1};
rsb::RsbMatrix<double> mtx(IA,JA,VA,nnzA);
mtx.spmm(RSB_TRANSPOSITION_N, alpha, nrhs, RSB_FLAG_WANT_ROW_MAJOR_ORDER, B, beta, C);
}
and compile and build, execute in your shell:
export PATH=$PATH:$HOME/local/librsb/bin/
g++ -std=c++20 `librsb-config --I_opts --cxxflags` -c mysbp.cpp
g++ -o mysbp mysbp.o `librsb-config --static --ldflags --extra_libs`
./mysbp
See the reference documentation for more complete examples.
The same program
(say, mysbf.f90
)
in Fortran:
PROGRAM MAIN
USE rsb
USE blas_sparse
IMPLICIT NONE
INTEGER :: istat = 0
INTEGER :: A
INTEGER :: transt=blas_no_trans
INTEGER :: incX=1, incB=1
REAL(KIND=8) :: alpha=3
INTEGER :: nnzA=4, nrA=3, ncA=3
INTEGER :: IA(4)=(/1, 2, 3, 3/)
INTEGER :: JA(4)=(/1, 2, 1, 3/)
REAL(KIND=8) :: VA(4)=(/11.0, 22.0, 13.0, 33.0/)
REAL(KIND=8) :: X(3)=(/ 0, 0, 0/)
REAL(KIND=8) :: B(3)=(/-1.0, -2.0, -2.0/)
TYPE(C_PTR),PARAMETER :: EO = C_NULL_PTR
istat = rsb_lib_init(EO)
IF(istat.NE.0)STOP
CALL duscr_begin(nrA,ncA,A,istat)
CALL ussp(A,blas_lower_symmetric,istat)
CALL uscr_insert_entries(A,nnzA,VA,IA,JA,istat)
CALL uscr_end(A,istat)
CALL usmv(transt,alpha,A,B,incB,X,incX,istat)
CALL usds(A,istat)
istat = rsb_lib_exit(EO)
END PROGRAM MAIN
Note how default indexing in Fortran is 1-based.
To compile and build, execute in your shell:
export PATH=$PATH:$HOME/local/librsb/bin/
gfortran `librsb-config --I_opts --fcflags` -c mysbf.f90
gfortran -o mysbf mysbf.o `librsb-config --static --ldflags --extra_libs`
./mysbf
The same program
can be rewritten using the native RSB interface via the ISO C Binding feature:
PROGRAM MAIN
USE rsb
USE ISO_C_BINDING
IMPLICIT NONE
INTEGER,TARGET :: errval
INTEGER :: transt = RSB_TRANSPOSITION_N
INTEGER :: incX = 1, incB = 1
REAL(KIND=8),TARGET :: alpha = 3, beta = 1
INTEGER :: nnzA = 4, nrA = 3, ncA = 3
INTEGER,TARGET :: IA(4) = (/1, 2, 3, 3/)
INTEGER,TARGET :: JA(4) = (/1, 2, 1, 3/)
INTEGER :: typecode = RSB_NUMERICAL_TYPE_DOUBLE
INTEGER :: flags = RSB_FLAG_DEFAULT_MATRIX_FLAGS+RSB_FLAG_SYMMETRIC
REAL(KIND=8),TARGET :: VA(4) = (/11.0, 22.0, 13.0, 33.0/)
REAL(KIND=8),TARGET :: X(3) = (/ 0, 0, 0/)
REAL(KIND=8),TARGET :: B(3) = (/-1.0, -2.0, -2.0/)
TYPE(C_PTR),TARGET :: mtxAp=C_NULL_PTR
TYPE(C_PTR) :: mtxApp = C_NULL_PTR
errval = rsb_lib_init(C_NULL_PTR)
IF(errval.NE.0) STOP "error calling rsb_lib_init"
mtxAp = rsb_mtx_alloc_from_coo_begin(nnzA,typecode,nrA,ncA,flags,C_LOC(errval))
errval = rsb_mtx_set_vals(mtxAp,C_LOC(VA),IA,JA,nnzA,flags)
mtxApp = C_LOC(mtxAp)
IF(errval.NE.0) STOP "error calling rsb_mtx_set_vals"
errval = rsb_mtx_alloc_from_coo_end(mtxApp)
IF(errval.NE.0) STOP "error calling rsb_mtx_alloc_from_coo_end"
errval = rsb_spmv(transt,C_LOC(alpha),mtxAp,C_LOC(X),incX,C_LOC(beta),C_LOC(B),incB) ! X:=X+(3)*A*B
IF(errval.NE.0) STOP "error calling rsb_spmv"
mtxAp = rsb_mtx_free(mtxAp)
IF(errval.NE.0) STOP "error calling rsb_mtx_free"
errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS)
IF(errval.NE.0) STOP "error calling rsb_lib_exit"
PRINT *, "rsb module Fortran test is ok"
END PROGRAM MAIN
And linked the usual way.
Just make sure you enabled the Fortran headers installation at configure time.
The advantage of using this interface is that it matches exactly the C version; this applies to the documentation as well.
In a nutshell:
-
librsb-1.3
-
Improve SpMM performance
-
New native C++ API in
<rsb.hpp>
-
librsb-1.2
-
New autotuning mechanism, better performance
-
Many API changes: Fortran interface, improve EPS rendering, BLAS extensions, etc.
-
librsb-1.1
-
Introduced autotuning for SpMV, SpMM, SpSV, SpSM
-
Many API changes: deprecate old functions, render matrices, BLAS extensions, etc.
-
librsb-1.0
-
First public release, many operations in the RSB format
-
Sparse BLAS functionality
For a detailed list of user visible changes, see the
NEWS.TXT file.
To get announcements, subscribe to this low-traffic Mailing List.
Mailing list (low traffic) is at: http://sourceforge.net/p/librsb/mailman/, mostly for librsb
release announcements.
-
Michele Martone ( michelemartone
AT
users
DOT
sourceforge
DOT
net )
Please include "librsb" or "sparsersb" or "PyRSB" in the "Subject:" line of your emails.
-
On the
WWW:
https://michelemartone.github.io/.
-
Parties interested in its further development or wishing to support it are welcome to contact me.
-
Bug reports,
documentation clarifications,
notes how did
librsb
improve your work/research (and which type of problems does it solve) are all appreciated.